Computer-Aided Lung Nodule Recognition by SVM Classifier Based on Combination of Random Undersampling and SMOTE

نویسندگان

  • Yuan Sui
  • Ying Wei
  • Da-Zhe Zhao
چکیده

In lung cancer computer-aided detection/diagnosis (CAD) systems, classification of regions of interest (ROI) is often used to detect/diagnose lung nodule accurately. However, problems of unbalanced datasets often have detrimental effects on the performance of classification. In this paper, both minority and majority classes are resampled to increase the generalization ability. We propose a novel SVM classifier combined with random undersampling (RU) and SMOTE for lung nodule recognition. The combinations of the two resampling methods not only achieve a balanced training samples but also remove noise and duplicate information in the training sample and retain useful information to improve the effective data utilization, hence improving performance of SVM algorithm for pulmonary nodules classification under the unbalanced data. Eight features including 2D and 3D features are extracted for training and classification. Experimental results show that for different sizes of training datasets our RU-SMOTE-SVM classifier gets the highest classification accuracy among the four kinds of classifiers, and the average classification accuracy is more than 92.94%.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improvement of Chemical Named Entity Recognition through Sentence-based Random Under-sampling and Classifier Combination

Chemical Named Entity Recognition (NER) is the basic step for consequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, extraction of the names of the molecules and their properties. Improvement in the performance of such systems may affects the quality of the subsequent tasks. Chemical text from which data for named entity recognition is extracte...

متن کامل

A New Computer-Aided Detection System for Pulmonary Nodule in CT Scan Images of Cancerous Patients

Introduction: In the lung cancers, a computer-aided detection system that is capable of detecting very small glands in high volume of CT images is very useful.This study provided a novelsystem for detection of pulmonary nodules in CT image. Methods: In a case-control study, CT scans of the chest of 20 patients referred to Yazd Social Security Hospital were examined. In the two-dimensional and ...

متن کامل

Improved Sampling Techniques for Learning an Imbalanced Data Set

This paper presents the performance of a classifier built using the stackingC algorithm in nine different data sets. Each data set is generated using a sampling technique applied on the original imbalanced data set. Five new sampling techniques are proposed in this paper (i.e., SMOTERandRep, Lax Random Oversampling, Lax Random Undersampling, Combined-Lax Random Oversampling Undersampling, and C...

متن کامل

Predicting credit card customer churn in banks using data mining

In this paper, we solve the customer credit card churn prediction via data mining. We developed an ensemble system incorporating majority voting and involving Multilayer Perceptron (MLP), Logistic Regression (LR), decision trees (J48), Random Forest (RF), Radial Basis Function (RBF) network and Support Vector Machine (SVM) as the constituents. The dataset was taken from the Business Intelligenc...

متن کامل

A Computer Aided Pulmonary Nodule Detection System Using Multiple Massive Training SVMs

A computer aided pulmonary nodule detection system for chest radiography is proposed. The system consists of three models, viz., lung segmentation, lung nodule candidates detection and false positive reduction. Several innovations are offered in this system. The first one is that the detection of potential lung nodule candidates is conceived as a filtering process that searches for any region w...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره 2015  شماره 

صفحات  -

تاریخ انتشار 2015